First of all, we would like to thank the discussants 
for the care and thoughtfulness that they have taken 
in preparing their comments. 

Koehler presents a helpful discussion, putting for- 
ward a number of different ideas that generalize the 
approach taken. A taxonomy for technical system 
elicitation would provide useful guidance for practi- 
tioners and serve to codify applicable assumptions 
during the different systems engineering phases. Al- 
though more research is needed here, one could see 
the emergence of international standards that rely 
on such a taxonomy. 

We acknowledge that the elicitation problem varies 
greatly depending on the technical system as pointed 
out by Koehler and we have sought to generalize 
our experience in studying complex systems, includ- 
ing aerospace, rail and naval for both commercial 
and defense markets. This explains our bias toward 
the "closed loop" case. We agree with the two ex- 
tra areas of expert elicitation identified for "wa- 
terfall" cases: lack of expertise continuity and the 
problem of "forward casting" requirements for an 
existing system. Both of these relate to discontinu- 
ous changes in system operation. Such changes have 
occurred most obviously in military systems and 
other projects with long lead times. However, in 
the commercial world, such discontinuities can be 
forced by regulatory or market changes, or by out- 
sourcing decisions. These may make historic data 
collection taxonomies less relevant to the reliability 
questions posed to support new operational deci- 
sions and, therefore, provide new areas of applica- 
tion for expert judgement techniques. 

The final point raised by Koehler about the diffi- 
culties imposed by system complexity is well made 
and the notion of multiple concurrent reliability models 
is intriguing. This does partially link into the notion 
of expert weighting. However, it also requires a 
good understanding of the notion of model "exper- 
tise" as distinct from expert "expertise." One might 
argue that if sufficient understanding exists to be 
able to quantify model expertise, then one should 
be able to directly build a meta model that incorpo- 
rates the best of each model. In practice, the need 
to be cost-efficient will usually militate against such 
a strategy, and model combination is an interesting 
alternative. 
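
As a rough illustration of what model combination could look like, the sketch below forms a weighted average (a linear pool) of reliability predictions from several models. The function name, the weights and the example values are assumptions made for illustration; they are not taken from Koehler's discussion or from our paper.

```python
def linear_pool(predictions, weights):
    """Combine model predictions of one reliability quantity by weighted average.

    predictions: one estimate per model (for example, a mission reliability).
    weights: non-negative scores standing in for model "expertise" (assumed given).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Hypothetical example: three models, the second trusted twice as much.
combined = linear_pool([0.90, 0.95, 0.85], [1.0, 2.0, 1.0])
```

The hard part, as noted above, is justifying the weights; the arithmetic itself is trivial.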

Wang rightly observes that we have not tried to 
give a survey of expert judgement methodologies. 
The main reason for this is that several surveys have 
been undertaken, including a recent one with a wide 
coverage (Jenkinson, 2005). It has not been our pur- 
pose to survey these methods again. Instead we aim 
to discuss the context in which such models may be 
used in the engineering design process and to show 
that the expert problem in this context frequently 
is more demanding than a "straightforward" proba- 
bility elicitation. 

Having said this, Wang is right to identify em- 
pirical Bayes (EB) as an interesting method with 
potential application in the area under discussion. 
There is, however, more than one way to utilize this 
approach. The approach discussed by Wang explic- 
itly uses expert information as data, hence forcing 
the analyst to choose priors and likelihoods for the 
expert data given the parameters. This is a funda- 
mental problem because it forces the analyst into 
the role of meta expert. In this case, the specification 
of p(x | θ) is going to be problematic whether or 
not we use EB. In our own work with EB (Quigley, 
Bedford and Walls, 2006, 2007) we have integrated 
expert judgement into the approach through the 
selection of pools that comprise different types of 
events whose data are merged in the EB process. 
The use of EB allows us to increase the quantity of 
data available to make estimates of reliability pa- 
rameters through expert judgements about which 
events should have similar order of magnitude behavior. 
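
To make the pooling idea concrete, the following is a minimal sketch that assumes a Poisson model for event counts and a Gamma prior fitted to the pool by a crude method of moments. These modeling choices, the function name and the example data are assumptions for illustration and should not be read as the exact procedure of the cited papers.

```python
from statistics import mean


def eb_pooled_rates(counts, exposures):
    """Empirical Bayes estimates of event rates for a pool of similar events.

    counts[i] is the number of occurrences of event type i observed over
    exposures[i] units of time. Expert judgement enters only through the
    choice of which event types belong in the pool.
    """
    if len(counts) != len(exposures) or len(counts) < 2:
        raise ValueError("need matching inputs for at least two event types")
    rates = [n / t for n, t in zip(counts, exposures)]
    pool_mean = mean(rates)
    if pool_mean <= 0:
        raise ValueError("the pool needs at least one observed event")
    # Between-event variance of the raw rates, less the average Poisson
    # sampling variance, gives a moment estimate of the prior variance.
    raw_var = sum((r - pool_mean) ** 2 for r in rates) / (len(rates) - 1)
    sampling_var = mean(pool_mean / t for t in exposures)
    prior_var = max(raw_var - sampling_var, 1e-12)  # floor keeps the prior proper
    beta = pool_mean / prior_var   # Gamma rate parameter
    alpha = pool_mean * beta       # Gamma shape parameter
    # Posterior mean under the Gamma-Poisson conjugate update.
    return [(alpha + n) / (beta + t) for n, t in zip(counts, exposures)]


# Hypothetical pool of four event types, each observed for 100 time units.
estimates = eb_pooled_rates([0, 2, 1, 5], [100.0, 100.0, 100.0, 100.0])
```

Each estimate is pulled from the raw rate toward the pool mean, so an event type with no recorded occurrences still receives a nonzero rate.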

Wang's proposal for using evidential reasoning in 
reliability combines a number of different question- 
able features. For the purposes of this rejoinder, 
we propose distinguishing three different issues con- 
tained in the discussion: 

• Nonprobabilistic representations of uncertainty. 

• Imprecise uncertainties. 

• Multicriteria decision models. 

Nonprobabilistic representations of uncertainty: We 
are yet to be convinced that these play a useful role. 
The examples we have seen discussed — both exam- 
ples to show the limitations of probability and exam- 
ples to show the need for a more general framework — 
are marred by lack of clarity about the underly- 
ing problem being modeled. Indeed this sometimes 
seems to be the point of the "need" for something 
else. In many cases more attention paid to struc- 
turing the problem and articulating the reasons for 
modeling will surely take care of many of the ambi- 
guities. To paraphrase O'Hagan and Oakley (2004), 
who recently wrote a paper titled "Probability is 
perfect, but we can't elicit it perfectly," we might say 
that "probability is perfect, but we find it difficult to 
apply appropriately." Such difficulties are even more 
apparent when applied to more complex generalizations 
of probability. The danger is that theoreticians 
use such methods as a fix to avoid resolving important 
modeling issues. 

Imprecise uncertainties: There is growing inter- 
est, and some sound foundational work, in the area 
of interval probabilities. Such quantities may have 
a real and useful application, particularly in bound- 
ing probabilities of undesirable events. See, for ex- 
ample, Coolen, Coolen-Schrijner and Yan (2002), 
Coolen and Yan (2003), Coolen (2004, 2006), Au- 
gustin and Coolen (2004) and Coolen and Coolen- 
Schrijner (2005). 
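
One simple way to see how interval-valued inputs can bound the probability of an undesirable event is sketched below. It uses the elementary Fréchet and Boole inequalities for a union of events and makes no assumption about dependence. It is not the nonparametric predictive inference developed in the papers cited above, and the function name and numbers are illustrative assumptions.

```python
def union_bounds(intervals):
    """Bound P(A_1 or ... or A_k) from interval probabilities for each A_i.

    intervals: list of (lower, upper) pairs with 0 <= lower <= upper <= 1.
    No dependence assumption is made, so the bounds can be wide.
    """
    for lo, hi in intervals:
        if not 0.0 <= lo <= hi <= 1.0:
            raise ValueError("each interval must satisfy 0 <= lower <= upper <= 1")
    lower = max(lo for lo, _ in intervals)            # the union is at least as likely as any A_i
    upper = min(1.0, sum(hi for _, hi in intervals))  # Boole's inequality
    return lower, upper


# Hypothetical failure-mode intervals for a single mission.
bounds = union_bounds([(0.01, 0.02), (0.03, 0.05)])
```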

Multicriteria decision models: It is important not 
to confuse such models, which in the first instance 
are designed to represent trade-offs between differ- 
ent attributes of a decision consequence, with proba- 
bilistic models that represent system and knowledge 
relationships. In the case of the motorcycle men- 
tioned in the discussion, the motorcycle is modeled 
most simply as a series system in the subsystems 
mentioned. The discussion of this example seems to 
force the analyst down a more complex route that ignores 
the basic engineering structure of the system. 

Furthermore, so many elements of the calculation 
appear to be arbitrary — for example, what is the 
event "that the ith basic attribute supports the hy- 
pothesis that the general attribute is assessed to the 
nth grade" that is being ascribed a probability and 
why should weights from Saaty's analytic hierarchy 
process be used to multiply probabilities? — that it 
is difficult to see that this leads to something re- 
ally meaningful and of more use than other simpler 
rule-of-thumb evaluations. 
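
For comparison, the series-system view of the motorcycle can be sketched in a few lines. The subsystem names and reliability values below are hypothetical, and independence between subsystem failures is assumed.

```python
from math import prod


def series_reliability(subsystem_reliabilities):
    """Reliability of a series system: it works only if every subsystem works.

    subsystem_reliabilities: mapping from subsystem name to the probability
    that the subsystem survives the mission (failures assumed independent).
    """
    return prod(subsystem_reliabilities.values())


# Hypothetical subsystems and values, for illustration only.
motorcycle = {"engine": 0.99, "brakes": 0.995, "transmission": 0.98}
system_reliability = series_reliability(motorcycle)
```

The result can never exceed the reliability of the weakest subsystem, which is the basic engineering structure referred to above.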

The experience of Fenton and Neil in develop- 
ing Bayesian methods, especially Bayesian networks, 
adds valuable support to many issues raised in the 
paper. We would certainly acknowledge that TRACS 
is an early example of a meta modeling system of the 
type we discuss and it is good to hear that model 
building in its more recent developments is faster. 
Unfortunately, because these are commercial sys- 
tems, it is difficult for academics to be able to make 
judgements about the internal workings of the sys- 
tems. 

We agree with the point raised by Fenton and Neil 
that the customer can be an expert, as well as a client, 
because it will often be the case that the customer 
possesses expertise about, for example, the opera- 
tional environment and maintenance of the family 
of systems. Hence the boundaries between the man- 
ufacturer and customer classes in Table I should 
be taken as an example of typical stakeholder roles 
rather than a fixed allocation appropriate for all 
systems. In those cases where the customer has dual 
roles, additional care is required to manage bias that 
arises due to the levels of trust. Our limited experi- 
ence to date in working with teams that span stake- 
holder classes has been mixed: we have experienced 
a lack of openness in some situations, while in others 
we enjoyed a sharing in both directions motivated by 
the need for a useful decision support tool. The pres- 
ence of trust will be influenced by the culture of the 
companies involved as well as the expected longevity 
of the relationship. The awareness and management 
of subjective bias is important, but we agree that it 
should not be regarded as a reason not to conduct 
Bayesian modeling. 

In the absence of much relevant empirical data, 
Fenton and Neil point out that reliability assessment 
can be regarded as a "black art." Certainly, Bayesian 
modeling can help to make assumptions more trans- 
parent. However, to some extent this simply brings 
with it a shift of difficulty from one area of model- 
ing to another. The parties have to find some level 
of agreement on prior distributions, which can be 
problematic if the parties really understand the sig- 
nificance of the choice being made. 

Fenton and Neil give examples of the use of ex- 
pert elicitation within six-sigma approaches. This 
is noteworthy given that many reliability problems 
arise from systematic design variation due to man- 
agement as well as technical considerations. Despite 
the strong relationship between reliability and qual- 
ity, culturally they can be disparate within organi- 
zations. By focusing on failure mode identification 
and tracking, we have experienced limited success 
in conceptually reeliciting priors for reliability mod- 
eling using production experience (Walls, Quigley 
and Marshall, 2006). The reasons for only limited 
success can be partially attributed to common pro- 
cess drivers identified by the aerospace companies 
involved in modeling. For example, the difficulties of 
using standard data-driven statistical process control 
for low-volume manufacturing have facilitated 
rather than hindered the acceptance of elicitation. 
However, we emphasize that this is conceptual acceptance 
by stakeholders; evidence of success in use 
currently remains scarce. Hence the research questions 
posed concerning cultural conflict, organizational 
drivers and process drivers are important to 
address issues for which only piecemeal anecdotal 
evidence currently exists. 

We would like to clarify to Fenton and Neil that 
we are not assuming implicitly or otherwise that the 
benefits of probability elicitation only accrue in sit- 
uations where there is already a highly developed 
reliability methodology and we do agree that elici- 
tation plays a distinctive role in organizations where 
it is not cost-effective to collect empirical data. How- 
ever, in situations where a highly developed reliabil- 
ity culture exists, there is a critical need to structure 
the models being quantified, and the users will cer- 
tainly benefit from that structuring phase, as well 
as the later quantification. 

Fenton and Neil point out that the "additional 
key benefit" of this kind of probability elicitation 
in terms of providing codified information for future 
systems is one that is certainly of importance 
in those industries with very short development cycles. 
For systems with longer cycles, there is time to 
collect operational information to update or replace 
the expert derived data, and industry "generic data 
bases" play the role discussed. 

We are grateful to the discussants for their com- 
ments, which provide further insights into many is- 
sues raised in the paper and contribute a number of 
new ideas that were not explored within the original 
paper. 